14 research outputs found

    Progressive insular cooperative genetic programming algorithm for multiclass classification

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. In contrast to other types of optimisation algorithms, Genetic Programming (GP) simultaneously optimises a group of solutions for a given problem. This group is named the population, the algorithm iterations are named generations, and the optimisation is named evolution, in reference to the algorithm's inspiration in Darwin's theory of the evolution of species. When a GP algorithm uses a one-vs-all class comparison for a multiclass classification (MCC) task, the classifiers for each target class (specialists) are evolved in a subpopulation, and the final solution of the GP is a team composed of one specialist classifier for each class. In this scenario, an important question arises: should these subpopulations interact during the evolution process, or should they evolve separately? This thesis presents the Progressively Insular Cooperative (PIC) GP, an MCC GP in which the level of interaction between specialists for different classes changes through the evolution process. In the first generations, the different specialists can interact more, but as the algorithm evolves, this level of interaction decreases. At a later point in the evolution process, controlled through algorithm parameterisation, these interactions can be eliminated. Thus, at the beginning of the algorithm there is more cooperation among specialists of different classes, favouring search space exploration. Once cooperation is eliminated, search space exploitation is favoured. In this work, different parameters of the proposed algorithm were tested using the Iris dataset from the UCI Machine Learning Repository. The results showed that cooperation among specialists of different classes helps to improve classifiers specialised in classes that are more difficult to discriminate. Moreover, the independent evolution of specialist subpopulations further benefits the classifiers once they have already achieved good performance. A combination of the two approaches seems to be beneficial when starting with subpopulations of differently performing classifiers. The PIC GP also performed well on the more complex Thyroid and Yeast datasets of the same repository, achieving accuracy similar to the best values found in the literature for other MCC models.
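    The progressive reduction in cooperation described in this abstract can be illustrated with a small sketch. It is not the thesis's implementation: the function names, the linear decay, the parameters `p_start` and `insular_generation`, and the idea that interaction happens through the choice of a crossover partner are all assumptions, chosen only to show interaction that decreases over generations and is eliminated at a configurable point.

```python
import random


def cooperation_probability(generation, p_start=0.8, insular_generation=50):
    """Hypothetical schedule: linear decay from p_start to 0.

    From `insular_generation` onwards the subpopulations no longer interact.
    """
    if generation >= insular_generation:
        return 0.0
    return p_start * (1.0 - generation / insular_generation)


def pick_mate_class(own_class, classes, generation, rng=random):
    """Choose the subpopulation from which a partner is drawn (assumed mechanism)."""
    others = [c for c in classes if c != own_class]
    if others and rng.random() < cooperation_probability(generation):
        return rng.choice(others)  # cooperate with a specialist of another class
    return own_class  # insular: stay within the same class
```

    With these assumed defaults, early generations frequently mix classes (exploration), while from generation 50 onwards every partner comes from the same class (exploitation).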

    Multi-Algorithm Clustering Analysis for Characterizing Cow Productivity on Automatic Milking Systems Over Lactation Periods

    Rebuli, K. B., Ozella, L., Vanneschi, L., & Giacobini, M. (2023). Multi-Algorithm Clustering Analysis for Characterizing Cow Productivity on Automatic Milking Systems Over Lactation Periods. Computers and Electronics in Agriculture, 211 (August 2023), 108002. https://doi.org/10.2139/ssrn.4435365, https://doi.org/10.1016/j.compag.2023.108002. This study is supported by Compagnia di San Paolo (ROL 63369 SIME 2020.1713) and by national funds through FCT (Fundação para a Ciência e a Tecnologia), under the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS. This study proposes a novel approach for characterizing the milk productivity patterns of cows milked by Automatic Milking Systems (AMSs) within each lactation period, and for assessing the stability of these patterns over time. AMSs enable real-time monitoring of udder health and milk quality during each milking episode, leading to an increasing amount of data that can be exploited to optimize herd management. Machine Learning (ML) algorithms are suitable for such situations, as they can handle multi-dimensional, heterogeneous, and large datasets. The methodology presented in this work used four clustering algorithms as unsupervised ML methods to cluster the cows within each lactation period. The clusters were grouped according to their productivity, and a merging index was defined to combine the clustering outcomes into a univocal result. The stability of the Productivity Groups (PGs) over time was analyzed. The proposed methodology was demonstrated using data from one farm with Holstein Friesian cows that exclusively uses the AMS. The PGs were found to be weakly stable over time, indicating that selecting cows for insemination based solely on their present or past lactation productivity may not be the most effective strategy. The study proposes using the same cows over all lactation periods to better understand the defining factors and dynamics of the PGs. Overall, the proposed framework provides a valuable tool for characterizing productivity groups and improving herd management practices in dairy farming.

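    Avaliação do potencial nutritivo do seston para maricultura de mar aberto no Estado do Paraná, BR [Evaluation of the nutritional potential of seston for open-ocean mariculture in the State of Paraná, Brazil]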

    Advisor: Frederico Pereira Brandini. Monograph (Bachelor's degree) - Universidade Federal do Paraná, Setor de Ciências da Terra, Centro de Estudos do Mar, undergraduate course in Oceanography. Open-ocean mariculture stands out internationally as a new frontier in the use of marine space for food production, free of the socio-environmental conflicts associated with mariculture in sheltered coastal areas. This work aimed to evaluate the quality of seston as food for filter-feeding molluscs on the shallow shelf of Paraná, between the 10 m and 30 m isobaths. Data on hydrography, chlorophyll a and photosynthesis were obtained, together with water samples from the surface, mid-depth and bottom for analyses of seston, particulate organic matter (POM), particulate organic carbon (POC) and phytoplankton carbon (Cfito). The hydrography revealed the formation of the seasonal thermocline in summer and a homogeneous water column in winter. Chlorophyll a concentrations were highest at the surface and at the bottom, and subsurface chlorophyll a maxima were recorded in summer. Seston concentration ranged from 8.8 to 26.5 mg l⁻¹, with the highest concentrations at the bottom in the coastal areas. POM and POC tended to decrease from the area nearest the coast to the area farthest from it, and were also more concentrated at the bottom. Their concentrations ranged from 0.1 to 2.17 mg l⁻¹ and from 129.3 to 531.1 µg l⁻¹, respectively. Cfito ranged from 1.07 to 54.8 µg l⁻¹ and was more concentrated at the bottom. The organic fraction of the seston was low, at most 8%. The quality of the POM, assessed by the contribution of Cfito to POC, increased towards the bottom and towards the region farthest from the coast, indicating that open-ocean mariculture in Paraná is viable in terms of food availability for filter-feeding molluscs, provided that the cultures are installed near the bottom and in the areas farthest from the coast.
Keywords: open-ocean mariculture; seston; particulate organic matter; particulate organic carbon; phytoplankton; shallow shelf

    Progressive Insular Cooperative GP

    Brotto Rebuli, K., & Vanneschi, L. (2021). Progressive Insular Cooperative GP. In T. Hu, N. Lourenço, & E. Medvet (Eds.), Genetic Programming: 24th European Conference, EuroGP 2021, Held as Part of EvoStar 2021, Virtual Event, April 7–9, 2021, Proceedings (pp. 19-35). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12691 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-72812-0_2. This work presents a novel genetic programming system for multi-class classification, called progressively insular cooperative genetic programming (PIC GP). Based on the idea that effective multiclass classification can be obtained by appropriately joining classifiers that are highly specialized on single classes, PIC GP evolves two populations at the same time. The first population contains individuals called specialists, and each specialist is optimized on one specific target class. The second population contains higher-level individuals, called teams, that join specialists to obtain the final algorithm prediction. By means of three simple parameters, PIC GP can tune the amount of cooperation between specialists of different classes. The first part of the paper is dedicated to a study of the influence of these parameters on the evolution dynamics. The obtained results indicate that PIC GP achieves the best performance when the evolution begins with a high level of cooperation between specialists of different classes, and this type of cooperation is then progressively decreased until only specialists of the same class can cooperate with each other. The last part of the work is dedicated to an experimental comparison between PIC GP and a set of state-of-the-art classification algorithms. The presented results indicate that PIC GP outperforms the majority of its competitors on the studied test problems.
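    The idea of a team that joins one specialist per class can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a team combines its specialists, so the rule used here (predict the class whose specialist returns the highest score) is an assumption, and `team_predict` and the hand-written toy specialists are hypothetical stand-ins for evolved GP individuals.

```python
def team_predict(team, x):
    """Return the class whose specialist gives the highest score for x.

    `team` maps a class label to a specialist, i.e. a callable that scores
    how strongly x belongs to that class (assumed combination rule: argmax).
    """
    scores = {label: specialist(x) for label, specialist in team.items()}
    return max(scores, key=scores.get)


# Toy specialists on a single feature, written by hand for illustration only.
toy_team = {
    "A": lambda x: 2.0 - x[0],        # scores high for small values
    "B": lambda x: -abs(x[0] - 4.0),  # scores high near 4
    "C": lambda x: x[0] - 6.0,        # scores high for large values
}

predictions = [team_predict(toy_team, (v,)) for v in (1.0, 4.0, 9.0)]
```

    Here `predictions` holds one label per input, each taken from the specialist with the largest score.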

    A Comparison of Structural Complexity Metrics for Explainable Genetic Programming [Poster]

    Rebuli, K. B., Giacobini, M., Silva, S., & Vanneschi, L. (2023). A Comparison of Structural Complexity Metrics for Explainable Genetic Programming [Poster]. In S. Silva, & L. Paquete (Eds.), GECCO '23 Companion: Proceedings of the Companion Conference on Genetic and Evolutionary Computation (pp. 539–542). Association for Computing Machinery (ACM). https://doi.org/10.1145/3583133.3590595. This work was partially supported by FCT, Portugal, through funding of research units MagIC/NOVA IMS (UIDB/04152/2020) and LASIGE (UIDB/00408/2020 and UIDP/00408/2020). Genetic Programming (GP) has the potential to generate intrinsically explainable models. In practice, however, this potential is not fully achieved because the solutions usually grow too much during the evolution. This excessive growth, together with the functional and structural complexity of the solutions, increases the computational cost and the risk of overfitting. Many approaches have therefore been developed to prevent solutions from growing excessively in GP. However, it is still an open question how these approaches can be used to improve the interpretability of the models. This article presents an empirical study of eight structural complexity metrics that have been used as evaluation criteria in multi-objective optimisation. Tree depth, size, visitation length, number of unique features, a proxy for human interpretability, number of operators, number of non-linear operators and number of consecutive non-linear operators were tested. The results show that potentially the best approach for generating good interpretable GP models is to combine more than one structural complexity metric.
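    Some of the structural metrics listed in this abstract are simple to compute on an expression tree, as the sketch below suggests. It makes several assumptions that do not come from the paper: trees are represented as nested tuples `(operator, child, ...)` with strings as features and numbers as constants, the root counts as depth 1, and the set `NONLINEAR_OPS` is an arbitrary example. Visitation length, the interpretability proxy and consecutive non-linear operators are left out.

```python
NONLINEAR_OPS = {"mul", "div", "sin", "exp", "log"}  # assumed example set


def children(node):
    """Operator nodes are tuples (op, child, ...); leaves have no children."""
    return node[1:] if isinstance(node, tuple) else ()


def size(node):
    """Total number of nodes (operators, features and constants)."""
    return 1 + sum(size(c) for c in children(node))


def depth(node):
    """Number of nodes on the longest root-to-leaf path (root counts as 1)."""
    return 1 + max((depth(c) for c in children(node)), default=0)


def unique_features(node):
    """Set of distinct feature names (string leaves) used in the tree."""
    if isinstance(node, tuple):
        return set().union(*(unique_features(c) for c in children(node)))
    return {node} if isinstance(node, str) else set()


def count_operators(node, ops=None):
    """Count operator nodes; if `ops` is given, count only operators in it."""
    if not isinstance(node, tuple):
        return 0
    here = 1 if ops is None or node[0] in ops else 0
    return here + sum(count_operators(c, ops) for c in children(node))


# Example tree for (x0 - 2.0) + sin(x1)
tree = ("add", ("sub", "x0", 2.0), ("sin", "x1"))
metrics = {
    "size": size(tree),
    "depth": depth(tree),
    "unique_features": len(unique_features(tree)),
    "operators": count_operators(tree),
    "nonlinear_operators": count_operators(tree, NONLINEAR_OPS),
}
```

    In a multi-objective setting, values such as these would be minimised alongside the prediction error; which metrics to combine is the question the study addresses.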